Net7: a new tool for bacterial comparative genomics: massive tracing of vertical and horizontal gene flux between genome elements

Authors

  • Marina Manrique
  • Pablo Pareja-Tobes
  • Eduardo Pareja-Tobes
  • Marta Brozynska
  • Eduardo Pareja
  • Raquel Tobes
Abstract

Identification and epidemiological studies of pathogenic bacteria are mainly based on genotyping a set of selected genes. Next Generation Sequencing technologies now make it possible to genotype the strains involved in outbreaks exhaustively by sequencing their whole genomes. Having the sequences of the complete gene set of a bacterium opens new analysis strategies that can provide insight into its provenance and into its possible evolution in the near future. Knowledge in these two directions can guide intervention strategies ranging from prevention to treatment and epidemiological surveillance.

We have developed a tool that detects the complete set of similarity relationships between each protein of a genome element and all bacterial and archaeal proteins present in UniProt, and hence with all the genome elements and taxonomic units to which the connected proteins belong. This new tool for bacterial comparative genomics allows massive tracing of vertical and horizontal gene flux between genome elements, based on the analysis of the similarity between their proteins. The similarity threshold for the relationships analyzed can be set to 90% or 100%. The tool provides the data needed to obtain network representations with Hiveplot or Gephi.

The network is built on Bio4j (http://bio4j.com/). Bio4j is a graph-based bioinformatics database that includes most of the data available in UniProt KB (SwissProt + TrEMBL), Gene Ontology (GO), UniRef (50, 90, 100), RefSeq, NCBI taxonomy, and the Expasy Enzyme DB; it is developed by the Era7 Bioinformatics research group Oh no sequences!. The current version of Bio4j (0.7) includes 530,642,683 relationships and 76,071,411 nodes. Bio4j uses Neo4j technology, another open source project. Performance is one of the main advantages of the platform. Thanks to the graph structure, data in Bio4j is organized in a way that is semantically equivalent to what it represents. As a result, queries that would even be impossible to perform with a standard relational database take just a couple of seconds with Bio4j. Bio4j is a freely available open source platform released under AGPLv3. Net7 will also be released under the AGPLv3 open source license.

IWBBIO 2013. Proceedings. Granada, 18-20 March, 2013, p. 279
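As a rough illustration of the kind of network data described in the abstract, the sketch below links two genome elements whenever they carry proteins assigned to the same similarity cluster, and writes the result as a Source/Target/Weight edge list of the kind Gephi's spreadsheet importer accepts. It is not Net7's actual implementation and does not use Bio4j: the record layout, the cluster identifiers, and the function names `shared_cluster_edges` and `edges_to_csv` are all assumptions made for this example.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations


def shared_cluster_edges(protein_records):
    """Count shared similarity clusters between pairs of genome elements.

    protein_records is an iterable of (protein_id, genome_element, cluster_id)
    tuples. This layout is hypothetical; a cluster_id stands for a group of
    proteins at a fixed similarity threshold (for instance 90% or 100%).
    """
    # Collect the genome elements that contribute a protein to each cluster.
    elements_by_cluster = defaultdict(set)
    for _protein_id, genome_element, cluster_id in protein_records:
        elements_by_cluster[cluster_id].add(genome_element)

    # Every pair of elements found in the same cluster gains one unit of weight.
    weights = defaultdict(int)
    for elements in elements_by_cluster.values():
        for a, b in combinations(sorted(elements), 2):
            weights[(a, b)] += 1
    return dict(weights)


def edges_to_csv(weights):
    """Serialize the weighted edges as CSV text with a Source,Target,Weight header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(weights.items()):
        writer.writerow([a, b, weight])
    return buffer.getvalue()


if __name__ == "__main__":
    # Toy records invented for illustration only.
    records = [
        ("P1", "chromosome_A", "C1"),
        ("P2", "plasmid_B", "C1"),
        ("P3", "chromosome_A", "C2"),
        ("P4", "plasmid_B", "C2"),
        ("P5", "chromosome_C", "C2"),
        ("P6", "chromosome_C", "C3"),
    ]
    print(edges_to_csv(shared_cluster_edges(records)))
```

Counting shared clusters per pair gives a simple edge weight. Telling vertical from horizontal flux would additionally need the taxonomic placement of each element, which this sketch leaves out.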


Similar resources

Horizontal gene transfer depends on gene content of the host

Horizontal gene transfer is a major contributor to the evolution of bacterial genomes. We examine this process through a combination of comparative genomics and in silico analysis of the Escherichia coli metabolic network. We validate our horizontal transfer estimates by confirming the predicted gradual amelioration of GC content over time. We find that the chance of acquiring a gene by horizon...


Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world

The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generali...


BPGA- an ultra-fast pan-genome analysis pipeline.

Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from few genome comparisons to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the genomic diversity of the dataset, determining core (conserved), accessory (dispensable) and unique...


Pervasive domestication of defective prophages by bacteria.

Integrated phages (prophages) are major contributors to the diversity of bacterial gene repertoires. Domestication of their components is thought to have endowed bacteria with molecular systems involved in secretion, defense, warfare, and gene transfer. However, the rates and mechanisms of domestication remain unknown. We used comparative genomics to study the evolution of prophages within the ...


Tracing Lifestyle Adaptation in Prokaryotic Genomes

Lifestyle adaptation of microbes due to changes in their ecological niches or acquisition of new environments is a major driving force for genetic changes in their respective genomes. Moving into more specialized niches often results in the acquisition of new gene sets via horizontal gene transfer to utilize previously unavailable metabolites, while genetic ballast is shed by gene loss and/or g...




Publication date: 2013